Fast Adapting Ensemble: A New Algorithm for Mining Data Streams with Concept Drift

Authors

  • Agustín Ortíz Díaz
  • José del Campo-Ávila
  • Gonzalo Ramos-Jiménez
  • Isvani Frías Blanco
  • Yailé Caballero Mota
  • Antonio Mustelier Hechavarría
  • Rafael Morales-Bueno
Abstract

The treatment of large data streams in the presence of concept drift is one of the main challenges in the field of data mining, particularly when the algorithms have to deal with concepts that disappear and then reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts and has been specifically designed to deal with recurring concepts. FAE processes the learning examples in blocks of the same size, but it does not have to wait for a block to be complete in order to adapt its base classification mechanism. FAE incorporates a drift detector to improve the handling of abrupt concept drifts and stores a set of inactive classifiers that represent old concepts, which are activated very quickly when these concepts reappear. We compare our new algorithm with various well-known learning algorithms on common benchmark datasets. The experiments show promising results for the proposed algorithm, in both accuracy and runtime, across different types of concept drift.
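The abstract names three mechanisms: block-wise processing, a drift detector that can react before a block is complete, and a store of inactive classifiers for old concepts. The abstract does not give the actual FAE procedure, so the Python sketch below is only a toy illustration of how such pieces could fit together. Every name in it (`TableLearner`, `BlockEnsembleSketch`, `block_size`, `window`, `max_error`) is hypothetical, and the sliding-window error-rate check and lookup-table base learner are simplifying assumptions, not the authors' design.

```python
from collections import Counter, deque


class TableLearner:
    """Toy base learner (assumption): most frequent label per input value."""

    def __init__(self, examples=()):
        self.counts = {}
        for x, y in examples:
            self.counts.setdefault(x, Counter())[y] += 1

    def predict(self, x):
        seen = self.counts.get(x)
        return seen.most_common(1)[0][0] if seen else None


def accuracy(model, examples):
    return sum(model.predict(x) == y for x, y in examples) / len(examples)


class BlockEnsembleSketch:
    """Illustrative only; not the published FAE algorithm."""

    def __init__(self, block_size=20, window=10, max_error=0.5):
        self.block_size = block_size
        self.max_error = max_error
        self.active = []                    # models that currently vote
        self.inactive = []                  # stored models of old concepts
        self.block = []                     # examples of the current block
        self.recent = deque(maxlen=window)  # (x, y, was_wrong) for the drift check
        self.reactivations = 0

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.active)
        votes.pop(None, None)               # ignore models that abstain
        return votes.most_common(1)[0][0] if votes else None

    def learn(self, x, y):
        self.recent.append((x, y, self.predict(x) != y))
        self.block.append((x, y))
        window_full = len(self.recent) == self.recent.maxlen
        error_rate = sum(wrong for _, _, wrong in self.recent) / len(self.recent)
        if window_full and error_rate > self.max_error:
            self._on_drift()                # adapt without waiting for the block
        elif len(self.block) >= self.block_size:
            self.active.append(TableLearner(self.block))
            self.block = []

    def _on_drift(self):
        # Heuristic: the misclassified recent examples stand in for the new concept.
        sample = [(x, y) for x, y, wrong in self.recent if wrong]
        pool = self.active + self.inactive
        good = [m for m in pool if accuracy(m, sample) > 1 - self.max_error]
        self.reactivations += sum(1 for m in good if m in self.inactive)
        self.inactive = [m for m in pool if m not in good]
        # If no stored model fits, start a fresh one (this also bootstraps the ensemble).
        self.active = good or [TableLearner(sample)]
        self.block = sample
        self.recent.clear()


def concept_a(x):
    return x % 2


def concept_b(x):
    return 1 - x % 2


# Synthetic stream: concept A, then B, then A again (a recurring concept).
model = BlockEnsembleSketch()
errors_on_return = 0
for i in range(180):
    x = i % 4
    y = concept_a(x) if i < 60 or i >= 120 else concept_b(x)
    if i >= 120 and model.predict(x) != y:
        errors_on_return += 1               # mistakes after concept A returns
    model.learn(x, y)

print(model.reactivations, errors_on_return)
```

In this toy stream the models learned during the first phase are moved to the inactive store when concept B arrives and are voted back in once concept A returns, so no retraining is needed for the recurring concept. A practical version would also bound the size of the inactive store and use a statistically grounded drift detector rather than a fixed error threshold.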

Similar resources

New Ensemble Method for Classification of Data Streams

Classification of data streams has become an important area of data mining, as the number of applications facing these challenges increases. In this paper, we propose a new ensemble learning method for data stream classification in the presence of concept drift. Our method is capable of detecting changes and adapting to new concepts that appear in the stream. Data stream classification; concept d...

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated from various information sources at high speed and high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refer...

Algorithm to handle Concept Drifting in Data Stream Mining

Data stream mining is an evolving field of research. Mining continuous data streams brings unique opportunities but also new challenges. This paper describes and evaluates a proposed classifier that combines an ensemble classifier with boosting. Adaptive windowing is also used to handle the data stream. An empirical study shows that the proposed classifier takes less memor...

Learning from Data Streams with Concept Drift

Increasing access to incredibly large, nonstationary datasets and corresponding demands to analyse these data have led to the development of new online algorithms for performing machine learning on data streams. An important feature of real-world data streams is "concept drift," whereby the distributions underlying the data can change arbitrarily over time. The presence of concept drift in a d...

Fast and Light Boosting for Adaptive Mining of Data Streams

Supporting continuous mining queries on data streams requires algorithms that (i) are fast, (ii) make light demands on memory resources, and (iii) adapt easily to concept drift. We propose a novel boosting ensemble method that achieves these objectives. The technique is based on a dynamic sample-weight assignment scheme that achieves the accuracy of traditional boosting without requiring...

Journal title:

Volume 2015, Issue

Pages -

Publication date: 2015